Abstract

In many different spatial discrimination tasks, such as in determining the sign of
the offset in a vernier stimulus, the human visual system exhibits hyperacuity-level
performance by evaluating spatial relations with the precision of a fraction
of a photoreceptor’s diameter. We propose that this impressive performance
depends in part on a fast learning process that uses relatively few examples
and occurs at an early processing stage in the visual pathway. We show that
this hypothesis is plausible by demonstrating that it is possible to synthesize,
from a small number of examples of a given task, a simple (HyperBF) network
that attains the required performance level. We then verify with psychophysical
experiments some of the key predictions of our conjecture. In particular, we
show that fast stimulus-specific learning indeed takes place in the human visual
system and that this learning does not transfer between two slightly different
hyperacuity tasks.

For any given visual competence, it is tempting to conjecture a specific algorithm
and a corresponding neural circuitry. It has often been implicitly assumed that this
machinery may be hardwired in the brain. This extreme point of view, if taken
seriously, may quickly lead to absurd consequences. Consider for instance the many
different  hyperacuity  tasks, ^  some  of  which  are  outlined  in  Figure  1.  Computational 
analysis  reveals  that  the  photoreceptor  spacing  and  the  low-pass  characteristics  of  the 
eye’s  optics  satisfy  (in  the  fovea)  the  constraints  of  the  sampling  theorem.^  Thus,  the 
underlying reason for the spectacular performance of human subjects in the hyperacuity
tasks is that the signal sampled by the photoreceptors and relayed to the brain
contains the information necessary for precise localization of image features. This
observation, however, does not constitute an explanation of hyperacuity, since each of
a  variety  of  hyperacuity  tasks  is  different  and,  in  principle,  would  require  a  different 
circuit  for  its  solution.  Note  that  the  idea  of  a  fine-grid  reconstruction  of  the  image 
in  some  layer  of  the  cortex*  does  not  address  the  problem,  because  it  still  requires 
a  homunculus  looking  at  the  reconstructed  image  and  applying  a  different  routine  or 
circuitry  for  each  specific  hyperacuity  task. 

We  propose  instead*  that  the  brain  may  be  able  to  synthesize  -  possibly  in  the 
cortex  -  appropriate  task-specific  modules  that  receive  input  from  retinotopic  cells 
and  learn  to  solve  the  task,  after  a  short  training  phase  in  which  they  are  exposed 
to  examples  of  the  task.  To  show  the  plausibility  of  our  argument,  we  first  describe 
a  model  that  learns  to  solve  vernier  acuity  tasks  from  a  few  examples.  Synthesizing 
a  module  from  examples  for  a  specific  computational  task  may  be  often  regarded  as 
approximating  a  multivariate  function  from  sparse  data.  We  have  chosen  to  use  for 
function  approximation  the  HyperBF  network  technique.*  Other  schemes,  such  as  the 
popular  Multilayer  Perceptrons  or  more  traditional  classification  techniques,®  could 
probably  be  used  as  well.  In  our  model  we  take  the  extreme  view  that  the  inputs 
are  photoreceptor  activities,  to  demonstrate  the  plausibility  of  low-level,  or  “early” 
learning. Biologically, it may be more reasonable to assume that the input to the
learning stage is provided by the circular center-surround and oriented cells in V1.®

In the simulated experiments, the learning module was given an array of “photoreceptor”
cell activities that corresponded to the input image blurred by the eye’s
optics. There were eight “receptors”, positioned randomly on a loose 4×2 grid (see
Figure 2). Each of the inputs was calculated by integrating the image over the point
spread function of the optics, approximated by a Gaussian of spatial extent σ = 30″
and time extent σ_t = 0.5 units. The simulated “retinal” patch had spatial dimensions
of 180″ × 360″ and a time dimension of 3 units. The 8-component vector of receptor
outputs constituted the input to the HyperBF module, which was trained to produce
an output of +1 for one sense of the input vernier displacement, and −1 for the other
over a set of examples of verniers randomly placed relative to the photoreceptor array.^
The  performance  of  the  module  was  estimated  by  measuring  the  error,  defined  as  the 
absolute  value  of  the  difference  between  the  actual  output  and  the  desired  output. 
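
The way such receptor inputs might be computed can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code used in the paper: the time dimension is ignored, the vernier is idealized as two infinitely thin vertical segments, and the names `make_receptors`, `segment_response` and `vernier_input`, the jitter of 15″, the segment length of 120″ and the orientation of the 4×2 grid (two columns by four rows) are all hypothetical choices.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def segment_response(rx, ry, x0, y_lo, y_hi, sigma):
    """Integral of an isotropic 2-D Gaussian centered at (rx, ry)
    along the thin vertical segment x = x0, y_lo <= y <= y_hi."""
    across = math.exp(-((x0 - rx) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))
    along = normal_cdf((y_hi - ry) / sigma) - normal_cdf((y_lo - ry) / sigma)
    return across * along

def make_receptors(rng, width=180.0, height=360.0, cols=2, rows=4, jitter=15.0):
    """Eight receptors on a loose grid: regular positions plus random jitter (assumed layout)."""
    cells = []
    for r in range(rows):
        for c in range(cols):
            x = (c + 0.5) * width / cols + rng.uniform(-jitter, jitter)
            y = (r + 0.5) * height / rows + rng.uniform(-jitter, jitter)
            cells.append((x, y))
    return cells

def vernier_input(cells, cx, cy, offset, length=120.0, sigma=30.0):
    """8-component input vector for a vertical vernier centered at (cx, cy).
    The upper segment is shifted by +offset/2 and the lower one by -offset/2."""
    activities = []
    for rx, ry in cells:
        upper = segment_response(rx, ry, cx + offset / 2.0, cy, cy + length, sigma)
        lower = segment_response(rx, ry, cx - offset / 2.0, cy - length, cy, sigma)
        activities.append(upper + lower)
    return activities

rng = random.Random(0)
cells = make_receptors(rng)
right = vernier_input(cells, 90.0, 180.0, +10.0)  # one sense of the offset
left = vernier_input(cells, 90.0, 180.0, -10.0)   # the opposite sense
```

The two vectors `right` and `left` differ only slightly, which is why the sign of the offset has to be read out from the pattern of graded activities rather than from any single receptor.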

Figure 1: Examples of five tasks in which human subjects perform at hyperacuity levels
(that is, exhibit resolution finer than the spacing between individual photoreceptors).

Figure 2: (a) shows, superimposed on the vernier stimulus, the mosaic of receptive
fields of “cells” assumed to provide the input to the HyperBF module shown in (b).
Each receptive field is depicted as a circle that refers to the point spread function of
the optics. Our simulation is robust with respect to positioning the “cells” at precisely
defined locations and to their receptive field properties. The network in (b) is equivalent
to equation 1.^

This error is a good analog of acuity threshold.® Another measure of performance that we
have  considered  is  the  percentage  of  correct  responses  (that  is,  responses  in  which  the 
sign  of  the  module’s  output  agreed  with  the  sign  of  the  vernier  displacement,  as  defined 
during  training). 
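
The two performance measures can be written down directly. The sketch below is illustrative only; the function names and the sample outputs are hypothetical, and the targets follow the ±1 convention used during training.

```python
def absolute_error(output, target):
    """Error on one trial: |actual output - desired output|."""
    return abs(output - target)

def percent_correct(outputs, targets):
    """Percentage of trials on which the sign of the output matches the sign of the target."""
    hits = sum(1 for o, t in zip(outputs, targets) if o * t > 0)
    return 100.0 * hits / len(targets)

# Hypothetical module outputs on four trials and the corresponding desired outputs.
outputs = [0.8, -0.3, 0.1, -1.2]
targets = [1.0, -1.0, -1.0, -1.0]

mean_error = sum(absolute_error(o, t) for o, t in zip(outputs, targets)) / len(targets)
pct = percent_correct(outputs, targets)
```

In this example the third trial has the wrong sign, so three of the four responses count as correct, while the mean absolute error also reflects how far the correct responses fall from ±1.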

The HyperBF module learned to solve the vernier task at a hyperacuity level from
a few examples.® The time course of the learning, illustrated in Figure 3a, shows that
the output classification error rate came within 10% of its asymptotic value after just
five examples.® All in all, the model replicated® several findings in the psychophysics
of spatial acuity: (a) hyperacuity-level performance; (b) improvement in the threshold
with increasing length of the two segments comprising the vernier stimulus;^ (c) deterioration
of performance with increasing orientation difference between training and
testing  trials;^®  (d)  high  performance  for  moving  verniers;^  and  (e)  performance  at  a 
similar  level  for  another  hyperacuity  task,  the  three-point  bisection,  after  learning  from 
suitable  examples. 

The  model’s  success  demonstrates  the  plausibility  of  the  hypothesis  that  learning 
of  hyperacuity  tasks  takes  place  early  in  the  visual  pathway.  A  more  critical  test 
is  provided  by  the  predictions  that  learning  of  a  hyperacuity  task  should  be  fast  (see 
Figure  3a)  and  may  not  transfer  even  to  a  slightly  different  hyperacuity  task  (Figure  3b 
shows  that  the  HyperBF  model  indeed  exhibits  no  transfer  of  learning  between  vertical 
and  horizontal  verniers).®  We  set  out  to  verify  experimentally  these  predictions  for 
human  hyperacuity  performance.  The  results  of  the  psychophysical  experiments  have 
borne  out  the  predictions  of  the  model.  First,  the  vernier  threshold  and  the  error 
rate  in  naive  subjects  improved  quickly  over  a  few  tens  of  trials  (Figure  4a).  Second, 
the subjects exhibited no transfer of learning between the vertical vernier and the
horizontal vernier tasks or vice versa (Figure 4b). In additional experiments (not
shown), there was no significant interocular transfer of learning, and little transfer
from a position 10° up in the visual field to a position of similar eccentricity down in
the visual field (or vice versa).^’

Our findings pertaining to fast stimulus-specific learning can be viewed in a wider
perspective that encompasses the issue of perceptual learning in general. A prominent
example is provided by the work of Fiorentini and Berardi^® who demonstrated
stimulus-specific learning effects in the discrimination of mixed spatial frequency gratings
that suggested the involvement of an early-stage mechanism. Similar to our case,
they found that learning did not transfer between different orientations of the grating.
They also found that there was interocular transfer of learning but little transfer across
retinal locations. Karni and Sagi^® recently described a texture discrimination task in
which the subjects showed a stimulus-specific learning effect that did show interocular
transfer but did not transfer either across orientations or across positions. Other similar
instances of specific perceptual learning had been reported even earlier.*® Plasticity
early in the visual pathway has been demonstrated experimentally*® and could provide
the adaptive mechanisms required by a module of the HyperBF type.

Figure 3: (a) shows the time course of learning by a HyperBF module given the input
shown in Fig. 2a (vertical verniers appearing at random positions, with random offsets
in a certain range). Each block in this simulation consists of just one trial; the ordinate
shows percentage of correct responses in each block (mean and ±1 standard error
over 30 simulation runs). (b) shows the effect of changing stimulus orientation from
vertical to horizontal at block 20: there is no transfer of learning, as expected, since the
examples used by the network correspond to very different patterns of activation of the
photoreceptors in the two cases. Feedback was provided in these simulations (but is not
strictly required by the learning algorithm). (c) shows responses of the four HyperBF
centers  (acquired  during  an  incremental  learning  session  that  consisted  of  150  trials) 
vs.  the  offset  of  a  vertical  vernier  presented  at  a  fixed  location.  During  learning,  the 
offsets  were  uniformly  distributed  between  4  and  12  pixels.  The  response  was  tested 
with  vertical  verniers  shown  at  the  same  location  and  having  an  offset  ranging  from 
−20 to 20 pixels. This illustration may be regarded as a recording of the receptive
fields  of  the  centers  in  the  space  of  possible  inputs.  Of  the  four  centers,  one  responded 
strongly  to  positive  offsets  and  weakly  to  negative  ones,  another  one  preferred  negative 
offsets,  and  the  other  two  had  no  clear  preference  for  any  offset  sign.  An  appropriate 
response  representing  the  sign  of  the  offset  may  be  formed  at  the  output  level  of  the 
HyperBF  module,  using  the  responses  of  the  sign-selective  centers. 

Figure 4: Psychophysical experiments corresponding to the simulations of Fig. 3: (a)
shows the time course of learning in a vernier task. A fast initial component is clear.
Data are means of six subjects; vertical bars represent standard errors. Each block
consisted of 40 trials. We found similar learning effects with verniers consisting of
three dots rather than of two lines. (b) shows the effect of switching from vertical to
horizontal verniers (or vice versa) after block 20. Averaged results of 12 subjects; six
started  with  horizontal  verniers,  the  others  started  with  vertical  verniers.  There  is  no 
transfer  of  learning.  In  these  experiments  feedback  was  provided  to  the  subjects. 

Our computational and psychophysical results support the conjecture that the modules
responsible for hyperacuity-level performance are synthesized early in the visual
pathway in a demand-driven fashion, when the appropriate task is first performed by
the subject. Related evidence regarding perceptual learning mentioned above suggests
that the same line of reasoning can be applied to visual tasks other than hyperacuity,
and even to faculties other than vision.^’*^ Importantly, learning HyperBF interpolation
can be implemented in a simple biologically plausible network.^’'* The proposal
that much of the information processing in the brain is performed by mechanisms related
to the HyperBF modules acting as enhanced look-up tables may bridge apparently
conflicting paradigms, such as Gibson’s immediate perception and Marr’s representational
theory, since appropriately encoded icons or “snapshots” of the world appear
to allow the synthesis of computational mechanisms effectively equivalent to vision
algorithms for tasks ranging from hyperacuity to object recognition.^*

$$f^*(\mathbf{x}) = \sum_{\alpha=1}^{n} c_\alpha \, G\left(\|\mathbf{x} - \mathbf{t}_\alpha\|_W^2\right) + p(\mathbf{x}) \tag{1}$$

where the parameters $\mathbf{t}_\alpha$, which correspond to the centers of appropriate basis
functions $G$ (such as the Gaussian), and the coefficients $c_\alpha$ are unknown, and are
in general much fewer than the data points ($n < N$). The norm is a weighted
norm

$$\|\mathbf{x} - \mathbf{t}_\alpha\|_W^2 = (\mathbf{x} - \mathbf{t}_\alpha)^T W^T W (\mathbf{x} - \mathbf{t}_\alpha) \tag{2}$$

where $W$ is an unknown square matrix and the superscript $T$ indicates the transpose.
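
As a worked special case of the weighted norm of equation 2 (an illustration only; the diagonal form of $W$ and the particular Gaussian $G(r^2) = e^{-r^2}$ are assumptions not made in the text), take $W = \mathrm{diag}(w_1, \ldots, w_d)$. Then

$$\|\mathbf{x} - \mathbf{t}_\alpha\|_W^2 = \sum_{k=1}^{d} w_k^2 \, (x_k - t_{\alpha,k})^2, \qquad G\left(\|\mathbf{x} - \mathbf{t}_\alpha\|_W^2\right) = \prod_{k=1}^{d} \exp\left(-w_k^2 \, (x_k - t_{\alpha,k})^2\right).$$

In this case each $w_k$ sets how sharply the unit is tuned along input dimension $k$: a large $w_k$ gives narrow tuning, and $w_k = 0$ makes the unit ignore that input altogether.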

The  network  of  Fig.  2b  corresponds  exactly  to  equation  1.  Its  interpretation  is 
the  following.  The  centers  of  the  basis  functions  are  similar  to  prototypes  (see 
Fig.  3c),  since  they  are  points  in  the  multidimensional  input  space.  Each  unit 
computes  a  (weighted)  distance  of  the  inputs  from  its  center  and  applies  to  it 
the  radial  function.  In  the  case  of  the  Gaussian,  a  unit  will  be  the  most  active 
when  the  input  exactly  matches  its  center.  The  output  of  the  network  is  a  linear 
superposition  of  the  activities  of  all  the  basis  functions,  plus  direct,  weighted 
connections  from  the  inputs  (the  linear  terms  of  p(x))  and  from  a  constant  input 
(the constant term). Notice that in the limit case of the basis functions approximating
delta functions, the system becomes equivalent to a look-up table holding
the  examples. 
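
The computation performed by such a network can be sketched in a few lines. This is a minimal illustration, assuming a Gaussian basis function of the form exp(-r²) applied to the squared weighted distance; the function names, the two-dimensional toy centers and the choice W = 10·I are hypothetical.

```python
import math

def weighted_sq_norm(x, t, W):
    """Squared weighted distance (x - t)^T W^T W (x - t), computed as |W (x - t)|^2."""
    d = [xi - ti for xi, ti in zip(x, t)]
    Wd = [sum(w * dj for w, dj in zip(row, d)) for row in W]
    return sum(v * v for v in Wd)

def gaussian(r2):
    """Gaussian radial function of the squared distance (assumed form)."""
    return math.exp(-r2)

def hyperbf(x, centers, coeffs, W, linear=None, bias=0.0):
    """Linear superposition of the basis-unit activities plus the polynomial
    term p(x): optional direct weighted connections and a constant."""
    rbf = sum(c * gaussian(weighted_sq_norm(x, t, W)) for c, t in zip(coeffs, centers))
    lin = sum(a * xi for a, xi in zip(linear, x)) if linear else 0.0
    return rbf + lin + bias

# Two stored examples used as centers, with their labels as coefficients.
centers = [(0.0, 1.0), (1.0, 0.0)]
coeffs = [1.0, -1.0]

# A large W makes each unit very narrow, approaching the look-up table limit.
W = [[10.0, 0.0], [0.0, 10.0]]

at_first = hyperbf(centers[0], centers, coeffs, W)   # close to +1
at_second = hyperbf(centers[1], centers, coeffs, W)  # close to -1
midway = hyperbf((0.5, 0.5), centers, coeffs, W)     # equidistant from both centers
```

With narrow units the network returns the stored label at each stored example and essentially nothing elsewhere; broader units (a smaller W) trade this exact recall for interpolation between examples.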

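The parameters $c$, $\mathbf{t}$, $W$ are searched for during learning by minimizing an error
functional defined as

$$H[f^*] = H_{c,\mathbf{t},W} = \sum_{i=1}^{N} (\Delta_i)^2$$

where

$$\Delta_i = y_i - f^*(\mathbf{x}_i) = y_i - \sum_{\alpha=1}^{n} c_\alpha \, G\left(\|\mathbf{x}_i - \mathbf{t}_\alpha\|_W^2\right).$$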

Thus learning in the HyperBF network corresponds to finding values of parameters
that minimize H. Iterative methods of the gradient descent type can be used
for the minimization of H. An even simpler method that does not require calculation
of derivatives is to look for random changes (controlled in appropriate ways)
in the parameter values that reduce the error. In the simulations described in
this paper the model was endowed with a dual incremental learning mechanism.
First, when the model’s performance on a new input was markedly inadequate
(in comparison with recent history), that input was adjoined to the model as an
additional center (prototype). This happened mainly in the initial trials, with
the number of centers eventually reaching an asymptote that depended on the
nature of the task and on the parameters that affected the decision to add new
centers. The performance of the model during these first trials improved quickly,
then stabilized as the number of centers asymptoted. Second, further gradual
improvement in the performance was obtained by letting the model carry out a
local random search in the space of existing HyperBF center coordinates. This
search was guided by feedback given to the model (that is, by indicating whether
the response at each trial was correct). Details of the learning algorithms, including
an extension of the incremental learning algorithm to a situation in which no
explicit feedback is available, can be found in Poggio, Fahle & Edelman, MIT AI
Memo 1271, 1991 and Weiss, Edelman, Fahle & Poggio, Weizmann CS-TR 21,
1991.
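
The two-stage scheme described above can be sketched as follows. This is a simplified illustration, not the algorithm of the cited memos: a fixed threshold stands in for the comparison with recent history, the random search is scored by the summed squared error H over the stored examples rather than by trial-by-trial feedback, W is taken as a scalar width, and the names `rbf_output`, `total_error` and `train` and the toy two-dimensional data are hypothetical.

```python
import math
import random

def rbf_output(x, centers, coeffs, width):
    """Sum of Gaussian units; returns 0.0 while there are no centers."""
    total = 0.0
    for c, t in zip(coeffs, centers):
        r2 = sum((a - b) ** 2 for a, b in zip(x, t))
        total += c * math.exp(-r2 / width ** 2)
    return total

def total_error(data, centers, coeffs, width):
    """H: sum over the examples of the squared output error."""
    return sum((y - rbf_output(x, centers, coeffs, width)) ** 2 for x, y in data)

def train(data, width=1.0, add_threshold=0.5, search_steps=200, step=0.05, seed=0):
    rng = random.Random(seed)
    centers, coeffs = [], []
    # Stage 1: adjoin an input as a new center when the error on it is large.
    for x, y in data:
        residual = y - rbf_output(x, centers, coeffs, width)
        if abs(residual) > add_threshold:
            centers.append(list(x))
            coeffs.append(residual)  # the output at x now equals y
    h_before = total_error(data, centers, coeffs, width)
    if not centers:
        return centers, coeffs, h_before, h_before
    # Stage 2: local random search over center coordinates; keep a change
    # only if it reduces the error, otherwise undo it.
    h = h_before
    for _ in range(search_steps):
        k = rng.randrange(len(centers))
        j = rng.randrange(len(centers[k]))
        old = centers[k][j]
        centers[k][j] = old + rng.uniform(-step, step)
        trial = total_error(data, centers, coeffs, width)
        if trial < h:
            h = trial
        else:
            centers[k][j] = old
    return centers, coeffs, h_before, h

# Toy examples: label +1 near (1, 0), label -1 near (-1, 0).
data = [((1.0, 0.2), 1.0), ((-1.0, -0.1), -1.0), ((0.8, -0.3), 1.0),
        ((-1.2, 0.3), -1.0), ((1.1, 0.1), 1.0), ((-0.9, 0.0), -1.0)]
centers, coeffs, h_before, h_after = train(data)
```

On this toy set the first two examples become centers and the remaining four are already handled well enough not to trigger further additions, which mirrors the reported pattern of fast initial improvement followed by slower refinement.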